Autonomous Free Flight Demo






Problem Background 


• NASA, Uber and Airbus have been exploring the exciting new concept of 
Urban Air Mobility (UAM) [1-5], in which autonomous electric vertical takeoff 
and landing (eVTOL) aircraft provide on-demand air taxi service. 

• The UAM operations are expected to fundamentally change cities and 
people’s lives. 

• In this paper we will combine the power of the free flight idea and onboard 
aircraft intelligence to enable safe and efficient flight operations. 



Image Courtesy: NASA UAM 






• According to previous research on free flight [6], compared with structured 
airspace, free flight with airborne separation can 

• handle a higher traffic density. 

• improve time efficiency. 

• The key to successful free flight is a real-time onboard computational 
guidance algorithm with automated conflict detection and resolution 
capability. 



Problem Description 


• Can we design a real-time computational guidance algorithm with collision 
avoidance capability to enable free flight en route operations in urban air 
mobility? 



Previous Work 


• Centralized guidance or path planning algorithms generally assume 
complete knowledge of the world and, in return, provide a complete path to 
the destination. 

• Centralized methods, which can be based on optimization techniques (MILP 
[7], MIQP [8], SCP [9], GA [10], PSO [11], etc.) or on sampling- or grid-based 
search algorithms (RRT [12], A* [13]), usually generate the whole trajectory by 
solving one large optimization problem. 

• Centralized methods can usually find the optimal solution, but they 

• can be computationally prohibitive for large fleets. 

• need to re-solve the problem whenever new information arrives. 


• In decentralized guidance or path planning algorithms, each aircraft resolves 
conflicts individually, based on the local information it gathers. 

• Decentralized methods scale better with respect to the number of agents and 
are more robust against single points of failure [14]. 

• These methods include the Potential Field algorithm [15], Deep Reinforcement 
Learning [16] and the Geometric Approach [17]. 

• We will use the Monte Carlo Tree Search method to solve this problem. 
Artificial Intelligence Success Stories


• Atari Games 

• Robotics 

• Autonomous Driving 

• Knowledge and Reasoning 

• Healthcare 

• Dialogue Systems 



Image courtesy: Google DeepMind - DQNBreakout 
Image courtesy: UC Berkeley Robot Learning Lab
Image courtesy: Elektrobit (EB)


Who is the president of United States? 


I don't know. Can you tell me? 


Donald Trump is the president of United 
States. 


I will remember that. 


President of USA? 


Donald Trump 


Image courtesy: Brainasoft 







AlphaGo 



• In March 2016, AlphaGo beat Lee Sedol in a five-game match, the first time 
a computer Go program had beaten a 9-dan professional without handicaps. 

• Monte Carlo Tree Search is the key part of the algorithm in AlphaGo. 


Image courtesy: Google DeepMind - AlphaGo 
Our Solution Methods


• Our plan is to formulate this computational guidance problem with collision 
avoidance function as a Markov Decision Process (MDP)[18] and solve this 
MDP using Monte Carlo Tree Search (MCTS)[19]. 

• The goal in an MDP is to maximize the cumulative reward by choosing actions 
optimally at each time step. 

• The main idea of MCTS is to estimate the value of an action through 
simulations and to build a search tree from the simulation results. 




Image courtesy: [20] 


Image Courtesy: [19] 













Problem Assumptions 


• All the aircraft fly at the same altitude. 

• All the intruders fly straight at a constant speed. 

• The ownship has perfect sensors with no measurement error. 
Markov Decision Process Formulation


• State: the position and velocity of all the aircraft, heading angle of ownship, 
and the position of the goal. 

• Action: at each time step (1 second), the ownship chooses to change its 
heading angle at a certain rate. The action space is 


A = {-2°/s, 0°/s, +2°/s} 


• Reward: 

• +1 if the ownship arrives at the goal state 

• 0 if there is any conflict between ownship and any intruder 

• This process will terminate with reward 0 when there is a conflict. 
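
The formulation above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `State`, `reward`, `in_conflict` and `at_goal` are hypothetical, the arrival tolerance `GOAL_RADIUS` is invented for the example, and the conflict test assumes the 320 m conflict distance quoted with the results.

```python
import math
from dataclasses import dataclass

# Heading-rate actions in degrees per second, as in the action space A.
ACTIONS = (-2.0, 0.0, 2.0)

CONFLICT_DIST = 320.0  # metres; assumed to be the distance used by the reward
GOAL_RADIUS = 100.0    # metres; hypothetical arrival tolerance


@dataclass(frozen=True)
class State:
    """One intruder, the ownship and the goal, all at the same altitude."""
    ix: float    # intruder position
    iy: float
    ivx: float   # intruder velocity
    ivy: float
    ox: float    # ownship position
    oy: float
    ovx: float   # ownship velocity
    ovy: float
    ophi: float  # ownship heading angle, radians
    gx: float    # goal position
    gy: float


def in_conflict(s):
    return math.hypot(s.ox - s.ix, s.oy - s.iy) < CONFLICT_DIST


def at_goal(s):
    return math.hypot(s.ox - s.gx, s.oy - s.gy) <= GOAL_RADIUS


def reward(s):
    """Return (reward, terminal) for state s."""
    if in_conflict(s):
        return 0.0, True   # a conflict ends the process with reward 0
    if at_goal(s):
        return 1.0, True   # +1 for arriving at the goal
    return 0.0, False      # otherwise the process continues
```

Checking for a conflict before checking for arrival is a design choice of this sketch: a state that is both near the goal and in conflict is treated as a failure.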



Example of a state for MDP 



• For the above figure, the state is simply 

S = (i_x, i_y, i_{vx}, i_{vy}, o_x, o_y, o_{vx}, o_{vy}, o_\phi, g_x, g_y) 

where the prefixes i, o and g denote the intruder, the ownship and the goal. 
Markov Decision Process Formulation


• State transition: 

• For a state 

S = (i_x, i_y, i_{vx}, i_{vy}, o_x, o_y, o_{vx}, o_{vy}, o_\phi, g_x, g_y) 

• Intruder information follows from the assumption that intruders fly 
straight at a constant speed. 

• Goal information does not change until the ownship arrives at the goal 
position. 

• Ownship information is updated using a kinematics-based system of 
differential equations. 
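
Under the straight-line assumption, the intruder part of the transition is a constant-velocity update. A minimal sketch, assuming the 1-second time step from the action definition; the function name and argument layout are illustrative only.

```python
def step_intruder(ix, iy, ivx, ivy, dt=1.0):
    """Advance an intruder flying straight at a constant speed by dt seconds.

    The position moves along the velocity vector; the velocity is unchanged.
    """
    return ix + ivx * dt, iy + ivy * dt, ivx, ivy
```

For example, `step_intruder(0.0, 0.0, 30.0, -40.0)` moves the intruder 30 m in x and -40 m in y and leaves its velocity as it was.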



Dynamic Model 


• The kinematic model of the ownship is of the form: 

\dot{x} = v \cos\phi 
\dot{y} = v \sin\phi 


• We will use the discretized version of the above model to update the ownship 
information, where a is the chosen heading rate and \Delta t = 1 s: 

o'_\phi = o_\phi + a \Delta t 

o'_{vx} = v \cos o_\phi 
o'_{vy} = v \sin o_\phi 

o'_x = o_x + o'_{vx} 
o'_y = o_y + o'_{vy} 
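
The discretized update can be sketched as a single function. This is an illustration under assumptions: the function name is hypothetical, the heading is stored in radians while the action is given in degrees per second, and the velocity components are computed from the heading before the turn is applied. Using the updated heading instead would be a one-line change.

```python
import math


def step_ownship(x, y, phi, v, a_deg, dt=1.0):
    """One discrete update of the ownship.

    phi is the heading in radians, v the constant speed in m/s and
    a_deg the commanded heading rate in degrees per second.
    Returns (x', y', vx', vy', phi').
    """
    new_phi = phi + math.radians(a_deg) * dt
    # Velocity components from the heading before the turn (assumed reading).
    vx = v * math.cos(phi)
    vy = v * math.sin(phi)
    # With dt = 1 s the position update reduces to x + vx and y + vy.
    return x + vx * dt, y + vy * dt, vx, vy, new_phi
```

Because the speed v is held constant, only the heading changes between steps, which matches the three-element action space.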
Monte Carlo Tree Search Algorithm


• Suppose that at time t we are in state s_t. We iterate the following four 
steps to find the best action for the current state in real time. 

• Selection step: when we are at a state we have seen, select the action j that 
maximizes 


\bar{X}_j + 2 C_p \sqrt{\frac{2 \ln n}{n_j}} 



where \bar{X}_j is the mean action value for action j, n is the number of times 
the current state has been visited, n_j is the number of times that action j has 
been used, and C_p > 0 is an exploration constant. In this way we can balance 
exploitation and exploration. 
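
A minimal sketch of the selection rule, assuming the standard UCB1 form X_j + 2 C_p sqrt(2 ln n / n_j). The function name, the default exploration constant `cp = 1/sqrt(2)`, taking n as the sum of the action counts, and trying untried actions first are assumptions of this example, not details given in the slides.

```python
import math


def ucb_select(mean_value, action_count, cp=1 / math.sqrt(2)):
    """Return the index j that maximizes X_j + 2*cp*sqrt(2 ln n / n_j).

    mean_value[j] is the mean value of action j and action_count[j] the
    number of times it has been used. n is taken as the sum of the counts.
    """
    n = sum(action_count)
    best_j, best_score = None, -math.inf
    for j, (x, nj) in enumerate(zip(mean_value, action_count)):
        if nj == 0:
            return j  # an untried action is always explored first
        score = x + 2 * cp * math.sqrt(2 * math.log(n) / nj)
        if score > best_score:
            best_j, best_score = j, score
    return best_j
```

With equal mean values the rule prefers the action that has been tried least, and with equal counts it prefers the action with the highest mean, which is the balance between exploration and exploitation described above.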





• Expansion step: when we meet a new state, one random action is added to 
expand the tree. 

• Simulation step: a simulation is run from the new state to a terminal state 
according to a random policy, producing a reward for that terminal state. 

• Backpropagation step: the simulation result is "backed up" through the 
selected states to update their values. 
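
The four steps can be put together as follows. This is a generic sketch on a hypothetical toy problem (a walk on the integers with a goal at +3 worth reward 1 and a losing state at -1 worth reward 0), not the aircraft simulator. All names, the exploration constant, the rollout step cap and the choice to return the most visited action are assumptions of this example.

```python
import math
import random

ACTIONS = (-1, +1)    # toy actions: step left or right
GOAL, PIT = 3, -1     # toy terminal states with reward 1 and 0


def step(state, action):
    return state + action


def terminal_reward(state):
    """Return the reward if the state is terminal, else None."""
    if state >= GOAL:
        return 1.0
    if state <= PIT:
        return 0.0
    return None


class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}              # action -> Node
        self.visits, self.total = 0, 0.0

    def untried(self):
        return [a for a in ACTIONS if a not in self.children]


def select_child(node, cp):
    # UCB1: mean value plus an exploration bonus.
    return max(
        node.children.values(),
        key=lambda c: c.total / c.visits
        + 2 * cp * math.sqrt(2 * math.log(node.visits) / c.visits),
    )


def rollout(state, rng, max_steps=50):
    # Random policy until a terminal state (or the step cap) is reached.
    for _ in range(max_steps):
        r = terminal_reward(state)
        if r is not None:
            return r
        state = step(state, rng.choice(ACTIONS))
    return 0.0


def mcts(root_state, iterations=2000, cp=1 / math.sqrt(2), seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through states already in the tree.
        while terminal_reward(node.state) is None and not node.untried():
            node = select_child(node, cp)
        # 2. Expansion: add one random untried action.
        if terminal_reward(node.state) is None:
            a = rng.choice(node.untried())
            node.children[a] = Node(step(node.state, a), parent=node)
            node = node.children[a]
        # 3. Simulation: random rollout from the new state.
        reward = rollout(node.state, rng)
        # 4. Backpropagation: update every node on the selected path.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Recommend the most visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)
```

From state 0, stepping left ends the episode with reward 0, so `mcts(0)` should recommend stepping right. In an onboard setting the loop would run for a fixed time budget rather than a fixed iteration count.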



[Figures: Monte Carlo Tree Search example with depth = 2; leaf states s_{t+2} 
with values 0, 0.7 and 0.8] 

We use 1 - d(o, g) / max d(o, g), where d(o, g) is the distance from the ownship 
to the goal, to represent the value of these non-terminal states (depth = 2). 
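
A sketch of such a depth-limited value, assuming d(o, g) is the Euclidean distance from the ownship to the goal and that it is normalized by the largest distance possible in the airspace. The function name and the `max_dist` parameter are hypothetical.

```python
import math


def leaf_value(ox, oy, gx, gy, max_dist):
    """Value of a non-terminal state at the depth limit.

    Returns 1 - d(o, g) / max_dist, clipped so the result stays in [0, 1].
    """
    d = math.hypot(gx - ox, gy - oy)
    return 1.0 - min(d / max_dist, 1.0)
```

The closer the ownship is to the goal when the simulation is cut off, the closer the value is to the +1 reward for arriving, so the search is still guided toward the goal when no terminal state is reached.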







Experiment Setup 



Result 



Conflict: when two aircraft are closer than 320 m. 

Near mid-air collision (NMAC): when two aircraft are closer than 30 m. 
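
The two thresholds can be checked with a small helper. A minimal sketch; the function name and the returned labels are illustrative only.

```python
import math

CONFLICT_DIST = 320.0  # metres
NMAC_DIST = 30.0       # metres


def separation_status(ax, ay, bx, by):
    """Classify the horizontal separation between two aircraft."""
    d = math.hypot(ax - bx, ay - by)
    if d < NMAC_DIST:
        return "NMAC"       # closer than 30 m
    if d < CONFLICT_DIST:
        return "conflict"   # closer than 320 m but not an NMAC
    return "clear"
```

Every NMAC is also a conflict by these definitions; the helper reports the more severe label.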



Xuxi Yang (xuxiyang@iastate.edu), ICRAT 2018, Barcelona, Spain, June 29, 2018


[Figure: number of goals reached]







Conclusion 


• We proposed a computational guidance algorithm with collision avoidance 
capability for autonomous on-demand free flight operations in urban air 
mobility. 

• We formulated this problem as a Markov Decision Process (MDP) and solved it 
using the Monte Carlo Tree Search (MCTS) algorithm. 

• Simulation results show this algorithm has promising performance. 

• The contribution of this research is integrating the power of onboard aircraft 
intelligence (vehicle autonomy technology) and the advantage of the free 
flight concept for airspace operations to enable safe and efficient flight 
operations in on-demand urban air transportation. 
Future Work


• Allow speed changes in this MCTS algorithm. 

• Use this MCTS framework to control multiple aircraft simultaneously. 

• Use higher-fidelity aircraft dynamics. 

• Incorporate uncertainties in aircraft dynamics and the environment. 
Autonomous Free Flight Demo (with Speed Change)



Autonomous Free Flight Demo (Multi-Agent) 